Discriminative Training of Language Model

Authors

Uwe Ohler

Stefan Harbeck

Abstract

We show how discriminative training methods, namely the Maximum Mutual Information and Maximum Discrimination approaches, can be adopted for the training of N-gram language models used as classifiers working on symbol strings. By estimating the model parameters according to a discriminative objective function instead of Maximum Likelihood (ML), the emphasis is put not on the exact modeling of each class but on the correct classification of the samples. The methods are shown to be suited for a variety of applications, such as the recognition of regulatory DNA sequences and language identification. Using phonotactic information, we achieve an error reduction of 10.7% (phoneme sequences) or 41.9% (codebook classes) with respect to standard ML estimation on a corpus of English and German sentences.
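The contrast the abstract draws can be illustrated with a minimal sketch. This is not the authors' procedure: it only trains one bigram model per class by Maximum Likelihood, classifies a string by its class posterior, and evaluates the quantity that Maximum Mutual Information training would maximise, namely the summed log posterior of the correct class. The add-one smoothing, the `^` start symbol, the toy strings, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

def train_bigram(strings, alphabet):
    """ML bigram estimates; add-one smoothing is an assumption of this sketch."""
    pair_counts, ctx_counts = Counter(), Counter()
    for s in strings:
        padded = "^" + s  # "^" is a hypothetical start-of-string symbol
        for a, b in zip(padded, padded[1:]):
            pair_counts[(a, b)] += 1
            ctx_counts[a] += 1
    v = len(alphabet)

    def log_prob(s):
        # log P(s | class) under the smoothed bigram model
        padded = "^" + s
        return sum(
            math.log((pair_counts[(a, b)] + 1) / (ctx_counts[a] + v))
            for a, b in zip(padded, padded[1:])
        )
    return log_prob

def log_posteriors(s, models, priors):
    """log P(class | s) by Bayes' rule, normalised over all class models."""
    joint = {c: math.log(priors[c]) + models[c](s) for c in models}
    m = max(joint.values())
    log_z = m + math.log(sum(math.exp(x - m) for x in joint.values()))
    return {c: joint[c] - log_z for c in joint}

def classify(s, models, priors):
    post = log_posteriors(s, models, priors)
    return max(post, key=post.get)

def mmi_objective(labelled, models, priors):
    """Summed log posterior of the correct class over labelled samples."""
    return sum(log_posteriors(s, models, priors)[c] for s, c in labelled)

# Toy two-class data (hypothetical, not from the paper).
train = {"A": ["aaab", "aaba", "aaaa"], "B": ["bbba", "bbab", "bbbb"]}
models = {c: train_bigram(strs, "ab") for c, strs in train.items()}
priors = {"A": 0.5, "B": 0.5}
data = [(s, c) for c, strs in train.items() for s in strs]

print(classify("aaa", models, priors))
print(mmi_objective(data, models, priors))
```

ML estimation fits each class model to its own samples only, whereas a discriminative criterion would adjust the parameters of all class models jointly so that `mmi_objective` increases; the sketch stops at evaluating that objective and does not perform the parameter update.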

Similar resources

Interdependence of Language Models and Discriminative Training

In this paper, the interdependence of language models and discriminative training for large vocabulary speech recognition is investigated. In addition, a constrained recognition approach using word graphs is presented for the efficient determination of alternative word sequences for discriminative training. Experiments have been carried out on the ARPA Wall Street Journal corpus. The recognitio...

Language Identification and Multilingual Speech Recognition Using Discriminatively Trained Acoustic Models

We perform language identification experiments for four prominent South-African languages using a multilingual speech recognition system. Specifically, we show how successfully Afrikaans, English, Xhosa and Zulu may be identified using a single set of HMMs and a single recognition pass. We further demonstrate the effect of language identification-specific discriminative acoustic model training ...

Discriminative Training and Support V Language Call Ro

In natural language call routing, callers are routed to desired departments based on natural spoken responses to an open-ended “How may I direct your call?” prompt. Natural language call classification can be performed using support vector machines (SVMs) or the popular vector-based model used in information retrieval. We recently demonstrate how discriminative training is powerful to improve a...

Discriminative training of language model classifiers

Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription

This paper describes a new method for semi-supervised discriminative language modeling, which is designed to improve the robustness of a discriminative language model (LM) obtained from manually transcribed (labeled) data. The discriminative LM is implemented as a log-linear model, which employs a set of linguistic features derived from word or phoneme sequences. The proposed semi-supervised di...

Discriminative Training of GMM for Language Identificatio..

In this paper, a discriminative training procedure for a Gaussian Mixture Model (GMM) language identification system is described. The proposal is based on the Generalized Probabilistic Descent (GPD) algorithm and Minimum Classification Error Rates formulated to estimate the GMM parameters. The evaluation is conducted using the OGI multi-language telephone speech corpus. The experimental result...

Publication date: 1999
